Template-free word spotting in low-quality manuscripts
نویسندگان
چکیده
As the OCR technique is not yet adequate for handwritten scripts with large lexicon, word spotting has been introduced as an alternative to OCR. This paper proposes a novel approach to word spotting that, instead of matching features of the word image to features extracted from predefined templates, uses the estimated posterior probability as the output of well trained classifier for spotting. Gabor features are extracted from gray scale image in order to yield higher performance on degraded, low quality document image.
منابع مشابه
Radial Line Fourier Descriptor for Segmentation-free Handwritten Word Spotting
Automatic recognition of historical handwritten manuscripts is a daunting task due to paper degradation over time. Recognition-free retrieval or word spotting is popularly used for information retrieval and digitization of the historical handwritten documents. However, the performance of word spotting algorithms depends heavily on feature detection and representation methods. Although there exi...
متن کاملLocal Binary Pattern for Word Spotting in Handwritten Historical Document
Digital libraries store images which can be highly degraded and to index this kind of images we resort to word spotting as our information retrieval system. Information retrieval for handwritten document images is more challenging due to the difficulties in complex layout analysis, large variations of writing styles, and degradation or low quality of historical manuscripts. This paper presents ...
متن کاملAdaptation des caractéristiques pseudo-Haar pour le word spotting dans les documents manuscrits
This paper addresses the problem of word spotting in handwritten documents. We propose a coarse-to-fine segmentation free approach. This approach is based on two filtering phases, which are a global filtering followed by a local filtering after changing the observation scale. The contribution of this work is the use and the adaptation of the Haarlike-features in word spotting task for each test...
متن کاملLexicon-free handwritten word spotting using character HMMs
For retrieving keywords from scanned handwritten documents, we present a word spotting system that is based on character Hidden Markov Models. In an efficient lexicon-free approach, arbitrary keywords can be spotted without pre-segmenting text lines into words. For a multi-writer scenario on the IAM off-line database as well as for two single writer scenarios on historical data sets, it is show...
متن کاملA Survey on Various Word Spotting Techniques for Content Based Document Image Retrieval
Searching documents for information and retrieval of relevant documents is a basic activity. Various tools are readily available for searching and retrieval from digital documents, but not much robust methods are available for retrieval from historic documents and old manuscripts as they are not digitized but available in scanned formats. Conventional way of retrieval from scanned document imag...
متن کامل